A max-margin training of RNA secondary structure prediction integrated with the thermodynamic model

Authors

  • Manato Akiyama
  • Kengo Sato
  • Yasubumi Sakakibara
Abstract

Motivation: A popular approach to predicting RNA secondary structure is the thermodynamic nearest-neighbor model, which finds the thermodynamically most stable secondary structure, that is, the one with the minimum free energy (MFE). To improve on it, an alternative approach based on machine learning has been developed. The machine-learning-based approach can employ a fine-grained model with much richer feature representations and the capacity to fit the training data. Although such fine-grained models have achieved extremely high prediction accuracy, a risk of overfitting has been reported for them.

Results: In this paper, we propose a novel algorithm for RNA secondary structure prediction that integrates the thermodynamic approach with the machine-learning-based weighted approach. Our fine-grained model combines experimentally determined thermodynamic parameters with a large number of scoring parameters for detailed feature contexts; the scoring parameters are trained by a structured support vector machine (SSVM) with ℓ1 regularization to avoid overfitting. Our benchmark shows that our algorithm achieves the best prediction accuracy among the existing methods compared, and no heavy overfitting is observed.

Availability: The implementation of our algorithm is available at https://github.com/keio-bioinformatics/mxfold.

Contact: [email protected]
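To make the training idea in the abstract concrete, the following is a generic textbook form of a max-margin (structured SVM) objective with an ℓ1 penalty. It is only a sketch: the symbols f_T, φ, λ, Δ, C, D and S(x) are illustrative assumptions, not notation taken from the paper, and the authors' exact formulation may differ.

```latex
% Generic structured-SVM objective with an l1 penalty (illustrative notation,
% not taken from the paper).
% x: RNA sequence, y: reference structure, \hat{y}: candidate structure,
% \mathcal{S}(x): candidate structures for x, \mathcal{D}: training set,
% f_T: score derived from fixed thermodynamic parameters (assumed form),
% \phi: feature vector, \lambda: learned weights, \Delta: structural loss
% with \Delta(y, y) = 0, C > 0: regularization strength.
\begin{align}
  f(x, y) &= f_{T}(x, y) + \lambda^{\top} \phi(x, y), \\
  \min_{\lambda} \;
    &\sum_{(x, y) \in \mathcal{D}}
      \Bigl[ \max_{\hat{y} \in \mathcal{S}(x)}
        \bigl( f(x, \hat{y}) + \Delta(y, \hat{y}) \bigr) - f(x, y) \Bigr]
    + C \, \lVert \lambda \rVert_{1}.
\end{align}
```

In this form the bracketed hinge term is zero only when the reference structure outscores every other candidate by at least its loss Δ, and the ℓ1 term pushes many learned weights to exactly zero, which is the usual reason for choosing it in a model with a very large number of parameters.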

Similar resources

MMKnots: A max-margin model for RNA secondary structure prediction including pseudoknots

Motivation: The ideal algorithm for the prediction of pseudoknotted RNA secondary structures will provide fast and accurate predictions for pseudoknots of arbitrary complexity. However, existing algorithms are typically lacking on one of these three axes. Energy-based methods suffer from the intractability of pseudoknotted structure prediction under realistic energy models, while statistical ap...

PreRkTAG: Prediction of RNA Knotted Structures Using Tree Adjoining Grammars

Background: RNA molecules play many important regulatory, catalytic and structural ...

Relation Between RNA Sequences, Structures, and Shapes via Variation Networks

Background: RNA plays a key role in many aspects of biological processes, and its tertiary structure is critical for its biological function. RNA secondary structure represents various significant portions of RNA tertiary structure. Since the biological function of RNA is concluded indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...

CONTRAfold: RNA secondary structure prediction without physics-based models

MOTIVATION For several decades, free energy minimization methods have been the dominant strategy for single sequence RNA secondary structure prediction. More recently, stochastic context-free grammars (SCFGs) have emerged as an alternative probabilistic methodology for modeling RNA structure. Unlike physics-based methods, which rely on thousands of experimentally-measured thermodynamic paramete...

A Fugacity Approach for Prediction of Phase Equilibria of Methane Clathrate Hydrate in Structure H

In this communication, a thermodynamic model is presented to predict the dissociation conditions of structure H (sH) clathrate hydrates with methane as help gas. This approach is an extension of the Klauda and Sandler fugacity model (2000) for prediction of phase boundaries of sI and sII clathrate hydrates. The phase behavior of the water and hydrocarbon system is modeled using the Peng-Robinso...


Publication date: 2017